L8: Entry 1 of 2 File: USPT Aug 26, 2003

US-PAT-NO: 6611840

DOCUMENT-IDENTIFIER: US 6611840 B1

TITLE: Method and system for removing content entity object in a hierarchically
structured content object stored in a database

DATE-ISSUED: August 26, 2003






INVENTOR-INFORMATION:

NAME                       CITY        STATE  ZIP CODE  COUNTRY
Baer; William J.           San Jose    CA
Hanapole; Edward           Pine Brook  NJ
Hartman, Jr.; Robert C.    San Jose    CA
Hennessy; Richard D.       York        ME
Johnson, Jr.; Eugene       Lexington   KY
Kao; I-Ming                San Jose    CA
Murray; Janet L.           Los Gatos   CA
Robertson, III; Jerry D.   San Jose    CA
Walkus; Richard W.         Wayne       NJ


ASSIGNEE-INFORMATION:

NAME                                         CITY                STATE  ZIP CODE  COUNTRY  TYPE CODE
International Business Machines Corporation  Armonk              NY                        02
Pearson Education, Inc.                      Upper Saddle River  NJ                        02



APPL-NO: 09/489087 [PALM]
DATE FILED: January 21, 2000 



PARENT-CASE:

CROSS REFERENCE TO RELATED APPLICATIONS

This application is related to the co-pending and commonly assigned patent
applications listed below, which were filed herewith on Jan. 21, 2000 and are all
incorporated by reference herein:

Method and System for Adding User-Provided Content to a Content Object Stored in a
Data Repository, Ser. No. 09/488,976

Method and System for Moving Content in a Content Object Stored in a Data
Repository, Ser. No. 09/488,971, is now pending

Prerequisite Checking in a System for Creating Compilations of Content, Ser. No.
09/488,9691, is now pending

Method and System for Preventing Mutually Exclusive Content Entities Stored in a
Data Repository to be Included in the Same Compilation of Content, Ser. No.
09/489,265, is now pending

Volume Management Method and System for a Compilation of Content, Ser. No.
09/489,090, is now U.S. Pat. No. 6,669,627

Method and System for Calculating Cost of a Compilation of Content, Ser. No.
09/489,143, is now pending

Method and System for Storing Hierarchical Content Objects in a Data Repository,
Ser. No. 09/489,570, is now pending

File Structure for Storing Content Objects in a Data Repository, Ser. No.
09/489,730, is now pending

Providing a Functional Layer for Facilitating Creation and Manipulation of
Compilations of Content, Ser. No. 09/489,605, is now pending

A Hitmask for Querying Hierarchically Related Content Entities, Ser. No.
09/489,133, is now pending

A Method and Configurable Model for Storing Hierarchical Data in a Non-Hierarchical
Data Repository, Ser. No. 09/489,561, is now pending
INT-CL: [07] G06F 17/30

US-CL-ISSUED: 707/102; 707/1, 707/100, 707/103, 707/104.1, 707/501.1, 707/513
US-CL-CURRENT: 707/102; 707/1, 707/100, 707/104.1, 715/501.1, 715/513

FIELD-OF-SEARCH: 707/1-10, 707/100-104.1, 707/501.1, 707/511-514, 707/907-908,
345/739, 345/764, 345/760, 345/769, 345/826, 345/854, 345/660, 709/200-203,
709/216-219, 705/10, 705/26-27



PRIOR-ART-DISCLOSED:

U.S. PATENT DOCUMENTS

PAT-NO    ISSUE-DATE      PATENTEE-NAME       US-CL
3964029   June 1976       Babb                340/172.5
4823306   April 1989      Barbie et al.       364/900
5251315   October 1993    Wang                395/600
5274757   December 1993   Miyoshi et al.      395/146
5297039   March 1994      Kanaegami et al.    364/419.13
5377348   December 1994   Lau et al.          395/600
5388196   February 1995   Pajak et al.        395/153
5579471   November 1996   Barber et al.       395/326
5664182   September 1997  Nierenberg et al.   707/102
5680619   October 1997    Gudmundson et al.   395/701
5758351   May 1998        Gibson et al.       707/104.1
5778398   July 1998       Nagashima et al.    707/501
5781732   July 1998       Adams               395/200.35
5787413   July 1998       Kauffman et al.     707/2
5806061   September 1998  Chaudhuri et al.    707/3
5847709   December 1998   Card et al.         345/775
5848404   December 1998   Hafner et al.       707/3
5848409   December 1998   Ann                 707/3
5857203   January 1999    Kauffman            707/200
5890147   March 1999      Peltonen et al.     707/1
5956715   September 1999  Glasser et al.      707/9
5963940   October 1999    Liddy et al.        704/9
5991756   November 1999   Wu                  707/3
6134552   October 2000    Fritz et al.        707/1
6212530   April 2001      Kadlec              707/201
6243709   June 2001       Tung                707/103R
6449627   September 2002  Baer et al.         715/514

FOREIGN PATENT DOCUMENTS

FOREIGN-PAT-NO  PUBN-DATE      COUNTRY  US-CL
63-286931       November 1988  JP
WO9932982       July 1999      WO

OTHER PUBLICATIONS 



IBM Digital Library, "Application Programming Reference," Version 2, Second Edition
(Sep. 1997), pp. 1147-1257.

IBM Digital Library, "Guide to Object-Oriented and Internet Application Programming,"
Version 2, Second Edition (Sep. 1997), pp. 1-169.

IBM Digital Library, "Text Search Using TextMiner Application Programming Reference,"
First Edition (Sep. 1997), pp. 1-246.

ART-UNIT: 2177 

PRIMARY-EXAMINER: Channavajjala; Srirama
ATTY-AGENT-FIRM: Foerster; Ingrid M.



ABSTRACT:

A web-based system, method and program product are provided for adding content to a
content object (e.g., a custom compilation or prepublished work) stored in a data
repository as a group of hierarchically related content entities. Each noncontainer
content object is preferably stored as a separate entity in the data repository. Each
content entity is also stored as a row in a digital library index class as a
collection of attributes and references to related content entities and containers. As
the user selects desired objects for inclusion in a content object, the system
arranges the objects hierarchically, e.g., into volumes, chapters and sections
according to the order specified by the user. The system then creates a file object
(e.g., a CBO) defining the content object that contains a list or outline of the
container and noncontainer entities selected, their identifiers, order and structure.
This file object is stored separately in the data repository. Content is removed from
the compilation by removing the container or noncontainer identifier from the list or
outline. This is achieved through a user interface by providing a mechanism for
enabling a user to select a container or noncontainer (e.g., by title) to be removed.

39 Claims, 36 Drawing figures
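
The removal step described in this abstract can be illustrated with a minimal sketch. It assumes the outline is held in memory as a tree of nodes, each carrying an identifier and a title; the `Node` class, the function names and the sample identifiers are hypothetical and are not taken from the patent or from the IBM Digital Library API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One outline entry: a container (volume, chapter) or a noncontainer (section)."""
    ident: str
    title: str
    children: list[Node] = field(default_factory=list)


def remove_by_id(node: Node, ident: str) -> bool:
    """Remove the first descendant with a matching identifier; report success."""
    for i, child in enumerate(node.children):
        if child.ident == ident:
            del node.children[i]  # removing a container drops its whole subtree
            return True
        if remove_by_id(child, ident):
            return True
    return False


def outline_ids(node: Node) -> list[str]:
    """Flatten the outline to identifiers in document order."""
    ids = [node.ident]
    for child in node.children:
        ids.extend(outline_ids(child))
    return ids


# Hypothetical compilation: two chapters holding three sections.
book = Node("book", "Custom compilation", [
    Node("ch1", "Chapter 1", [Node("s1.1", "Section 1.1"), Node("s1.2", "Section 1.2")]),
    Node("ch2", "Chapter 2", [Node("s2.1", "Section 2.1")]),
])
remove_by_id(book, "s1.2")  # remove a noncontainer
remove_by_id(book, "ch2")   # remove a container and everything under it
print(outline_ids(book))
```

Only the outline changes in this sketch; the stored content entities themselves are untouched, which matches the abstract's statement that removal is done by deleting an identifier from the list or outline.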
L21: Entry 1 of 3 File: EPAB Feb 5, 2003

PUB-NO: EP001282060A2

DOCUMENT-IDENTIFIER: EP 1282060 A2

TITLE: System and method for providing place and price protection in a search result
list generated by a computer network search engine

PUBN-DATE: February 5, 2003



INVENTOR-INFORMATION:

NAME                            COUNTRY
CHEUNG, DOMINIC DOUGH-MING      US
SINGH, NARINDER PAL             US
SOULANILLE, THOMAS A            US
DAVIS, DARREN J                 US

ASSIGNEE-INFORMATION:

NAME                            COUNTRY
OVERTURE SERVICES INC           US



APPL-NO: EP02255466 
APPL-DATE: August 5, 2002 

PRIORITY-DATA: US92202801A (August 3, 2001)

INT-CL (IPC): G06F 17/60; G06F 17/30
EUR-CL (EPC) : G06F017/30 

ABSTRACT : 

A method and apparatus for managing search listings (344) in a search database (38)
include storing one or more search listings for an advertiser. Each search listing
includes an associated search term (352). The system receives from the advertiser
identification information for a search listing and a desired rank for the
identified search listing, a maximum cost per click for the search listing, or both.
The system stores the desired rank and/or maximum cost per click for the search
listing. The system then determines a cost per click for the identified search
listing based on the desired rank and other search listings which include the search
term associated with the identified search listing.
L21: Entry 2 of 3 File: DWPI Nov 14, 2002

DERWENT-ACC-NO: 2003-199412
DERWENT-WEEK: 200374

COPYRIGHT 2003 DERWENT INFORMATION LTD

TITLE: Internet based database search apparatus includes search engine to search
database in which search list, search items, bid rank and desired rank for
advertisers are stored

INVENTOR: CHEUNG, D D; DAVIS, D J; SINGH, N P; SOULANILLE, T A; DOUGH, M C D;
PAL, S N

PATENT-ASSIGNEE: OVERTURE SERVICES INC (OVERN), OVERTURE SERVICE CORP (OVERN),
CHEUNG D D (CHEUI), DAVIS D J (DAVII), SINGH N P (SINGI), SOULANILLE T A (SOULI)

PRIORITY-DATA: 2001US-0922028 (August 3, 2001), 1999US-0322677 (May 28, 1999),
2001US-0911674 (July 24, 2001)



PATENT-FAMILY:

PUB-NO             PUB-DATE           LANGUAGE  PAGES  MAIN-IPC
US 20020169760 A1  November 14, 2002            073    G06F007/00
EP 1282060 A2      February 5, 2003   E         000    G06F017/60
FR 2828310 A1      February 7, 2003             000    G06F017/60
DE 10235429 A1     March 20, 2003               000    G06F017/60
CA 2396501 A1      February 3, 2003   E         000    G06F017/30
WO 2003014865 A2   February 20, 2003  E         000    G06F000/00
GB 2381345 A       April 30, 2003               000    G06F017/60
KR 2003013333 A    February 14, 2003            000    G06F017/30
CN 1407487 A       April 2, 2003                000    G06F017/30
JP 2003233684 A    August 22, 2003              195    G06F017/60



DESIGNATED-STATES: AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LI LT LU LV MC
MK NL PT RO SE SI SK TR AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ
DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR
LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM
TN TR TT TZ UA UG UZ VN YU ZA ZM ZW AT BE BG CH CY CZ DE DK EA EE ES FI FR GB GH GM
GR IE IT KE LS LU MC MW MZ NL OA PT SD SE SK SL SZ TR TZ UG ZM ZW

APPLICATION-DATA:

PUB-NO           APPL-DATE        APPL-NO               DESCRIPTOR
US20020169760A1  May 28, 1999     1999US-0322677
US20020169760A1  July 24, 2001    2001US-0911674        CIP of
US20020169760A1  August 3, 2001   2001US-0922028
US20020169760A1                   US 6269361            Cont of
EP1282060A2      August 5, 2002   2002EP-0255466
FR2828310A1      August 2, 2002   2002FR-0009909
DE10235429A1     August 2, 2002   2002DE-10*55429
CA2396501A1      August 1, 2002   2002CA-2'*96501
WO2003014865A2   July 24, 2002    2002WO-US23502
GB2381345A       August 5, 2002   2002GB-001[illegible]
KR2003013333A    August 3, 2002   2002KR-0045944
CN1407487A       August 3, 2002   2002CN-0147281
JP2003233684A    August 2, 2002   2002JP-0260581





INT-CL (IPC) : O0£ E J1/.0H; G£L£ E Z/Ail; G06 Z 12/211; G0£ E 12/fiG; E HOA L 

12/16 

RELATED-ACC-NO: 2001-327720; 2002-048793; 2002-105680; 2003-120213; 2003-168005;
2003-203316; 2003-362948; 2003-710850

ABSTRACTED-PUB-NO: US20020169760A
BASIC-ABSTRACT: 

NOVELTY - A search engine searches a database comprising a search list which
includes search term specified by the advertiser and bid rank associated with the
search term. The bid rank includes maximum cost per click chargeable to the
advertiser and rank desired by the advertiser.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are included for the following: 

(1) Method for managing search listing in database; 

(2) System for managing search listing in database; 

(3) Method of generating a search result list; 

(4) Method of enabling network information provider to update information; 

(5) Method of determining cost per click and search listings to be associated with 
each rank position of a search result display; and 

(6) Database search system. 

USE - Internet based database search apparatus. 

ADVANTAGE - Reduces workload on advertisers to maintain economic position by
detecting cost per click (CPC) for search items and notifying CPC to advertisers
based on interaction of users with the search items over the internet.



DESCRIPTION OF DRAWING (S) - The figure shows the chart of menus, display screens and 
input screens in the database search apparatus. 



ABSTRACTED-PUB-NO: US20020169760A
EQUIVALENT-ABSTRACTS:

CHOSEN-DRAWING: Dwg.2/39 



DERWENT-CLASS: P85 T01

EPI-CODES: T01-J05B3; T01-J05B4; T01-N01A;

File: DWPI Apr 1, 2003



DERWENT-ACC-NO: 2000-023076 
DERWENT-WEEK: 200366 

COPYRIGHT 2003 DERWENT INFORMATION LTD 

TITLE: Document ranking method for retrieving hypertext documents in web site 
INVENTOR: CHAKRABARTI, S; DOM, B E 

PATENT-ASSIGNEE: IBM CORP (IBMC), IBM UK LTD (IBMC), INT BUSINESS MACHINES CORP 
(IBMC) 



PRIORITY-DATA: 1998US-0058635 (April 10, 1998)



PATENT-FAMILY:

PUB-NO         PUB-DATE            LANGUAGE  PAGES  MAIN-IPC
TW 526432 A    April 1, 2003                 000    G06F017/30
WO 9953418 A1  October 21, 1999              027    G06F017/30
US 6125361 A   September 26, 2000            000    G06F017/30
EP 1070296 A1  January 24, 2001              000    G06F017/30
CN 1296589 A   May 23, 2001                  000    G06F017/30



DESIGNATED -STATES: CA CN JP KR PL AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT 
SE DE FR GB IE IT NL 



APPLICATION-DATA:

PUB-NO         APPL-DATE          APPL-NO          DESCRIPTOR
TW 526432A     February 9, 1999   1999TW-0101973
WO 9953418A1   March 12, 1999     1999WO-GB00752
US 6125361A    April 10, 1998     1998US-0058635
EP 1070296A1   March 12, 1999     1999EP-0907779
EP 1070296A1   March 12, 1999     1999WO-GB00752
EP 1070296A1                      WO 9953418       Based on
CN 1296589A    March 12, 1999     1999CN-0804913



INT-CL (IPC): G06F 17/30



ABSTRACTED-PUB-NO: US 6125361A
BASIC-ABSTRACT:

NOVELTY - The reference to a second document in a first document is identified and
lexical distance which defines document terms is received. The query terms are
received and the number of times query terms are present in the first document within
the lexical distance of the reference to the second document is determined for
ranking the documents.

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for method for finding 
associations in computer stored documents between document terms and query topics . 

USE - For retrieving hypertext documents in web site and also other documents such
as patents, academic papers, articles, books, E-mails, etc.

ADVANTAGE - Finds association in stored documents between documents terms and query
topics represented by query terms and improves web searching, thus easy to use and
is cost effective.

DESCRIPTION OF DRAWING ( S ) - The figure shows flowchart representing logic for 
growing list of web sites in response to query. 

ABSTRACTED-PUB-NO: WO 9953418A
EQUIVALENT-ABSTRACTS:

NOVELTY - The reference to a second document in a first document is identified and
lexical distance which defines document terms is received. The query terms are
received and the number of times query terms are present in the first document within
the lexical distance of the reference to the second document is determined for
ranking the documents.

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for method for finding 
associations in computer stored documents between document terms and query topics. 

USE - For retrieving hypertext documents in web site and also other documents such 
as patents, academic papers, articles, books, E-mails, etc. 

ADVANTAGE - Finds association in stored documents between documents terms and query
topics represented by query terms and improves web searching, thus easy to use and
is cost effective.

DESCRIPTION OF DRAWING (S) - The figure shows flowchart representing logic for 
growing list of web sites in response to query. 
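
The counting step named in the NOVELTY paragraph can be sketched as follows. The sketch assumes the first document is already tokenized, that references to the second document are given as token positions, and that lexical distance is measured in tokens. The function name, the `LINK` placeholder token and the sample text are hypothetical.

```python
def proximity_score(tokens, ref_positions, query_terms, lexical_distance):
    """Count query-term occurrences within `lexical_distance` tokens of any reference."""
    query = {term.lower() for term in query_terms}
    counted = set()
    for pos in ref_positions:
        lo = max(0, pos - lexical_distance)
        hi = min(len(tokens), pos + lexical_distance + 1)
        for i in range(lo, hi):
            if i != pos and tokens[i].lower() in query:
                counted.add(i)  # a set avoids double counting near two references
    return len(counted)


# "LINK" stands in for a hyperlink from this document to the second document.
doc = "see the fast car review at LINK for more car news and LINK for bikes".split()
refs = [i for i, token in enumerate(doc) if token == "LINK"]
score = proximity_score(doc, refs, ["car", "review"], lexical_distance=3)
print(score)
```

A larger score suggests that the text around the reference is about the query topic, which is the signal the abstract uses for ranking.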

CHOSEN-DRAWING: Dwg.3/6

DERWENT-CLASS: T01

EPI-CODES: T01-E01A; T01-H07C3C; T01-H07C5E; T01-J05B1; T01-J05B3;

L25: Entry 26 of 30 File: USPT May 19, 1998

DOCUMENT-IDENTIFIER: US 5754939 A

TITLE: System for generation of user profiles for a system for customized electronic 
identification of desirable objects 



Abstract Text (1):

This invention relates to customized electronic identification of desirable objects, 
such as news articles, in an electronic media environment, and in particular to a 
system that automatically constructs both a "target profile" for each target object 
in the electronic media based, for example, on the frequency with which each word 
appears in an article relative to its overall frequency of use in all articles, as 
well as a "target profile interest summary" for each user, which target profile 
interest summary describes the user's interest level in various types of target 
objects. The system then evaluates the target profiles against the users' target 
profile interest summaries to generate a user-customized rank ordered listing of 
target objects most likely to be of interest to each user so that the user can 
select from among these potentially relevant target objects, which were 
automatically selected by this system from the plethora of target objects that are 
profiled on the electronic media. Users' target profile interest summaries can be 
used to efficiently organize the distribution of information in a large scale system 
consisting of many users interconnected by means of a communication network. 
Additionally, a cryptographically-based pseudonym proxy server is provided to ensure 
the privacy of a user's target profile interest summary, by giving the user control 
over the ability of third parties to access this summary and to identify or contact 
the user. 

Brief Summary Text (2) : 

This invention relates to customized electronic identification of desirable objects, 
such as news articles, in an electronic media environment, and in particular to a 
system that automatically constructs both a "target profile" for each target object 
in the electronic media based, for example, on the frequency with which each word 
appears in an article relative to its overall frequency of use in all articles, as 
well as a "target profile interest summary" for each user, which target profile 
interest summary describes the user's interest level in various types of target
objects. The system then evaluates the target profiles against the users' target 
profile interest summaries to generate a user-customized rank ordered listing of 
target objects most likely to be of interest to each user so that the user can 
select from among these potentially relevant target objects, which were 
automatically selected by this system from the plethora of target objects that are 
profiled on the electronic media. Users' target profile interest summaries can be 
used to efficiently organize the distribution of information in a large scale system 
consisting of many users interconnected by means of a communication network. 
Additionally, a cryptographically based proxy server is provided to ensure the 
privacy of a user's target profile interest summary, by giving the user control over 
the ability of third parties to access this summary and to identify or contact the 
user. 

Brief Summary Text (15):

Relevant definitions of terms for the purpose of this description include: (a.) an
object available for access by the user, which may be either physical or electronic
in nature, is termed a "target object", (b.) a digitally represented profile
indicating that target object's attributes is termed a "target profile", (c.) the
user looking for the target object is termed a "user", (d.) a profile holding that
user's attributes, including age/zip code/etc. is termed a "user profile", (e.) a
summary of digital profiles of target objects that a user likes and/or dislikes, is
termed the "target profile interest summary" of that user, (f.) a profile consisting
of a collection of attributes, such that a user likes target objects whose profiles
are similar to this collection of attributes, is termed a "search profile" or in
some contexts a "query" or "query profile," (g.) a specific embodiment of the target
profile interest summary which comprises a set of search profiles is termed the
"search profile set" of a user, (h.) a collection of target objects with similar
profiles, is termed a "cluster," (i.) an aggregate profile formed by averaging the
attributes of all target objects in a cluster, is termed a "cluster profile," (j.)
a real number determined by calculating the statistical variance of the profiles of
all target objects in a cluster, is termed a "cluster variance," (k.) a real number
determined by calculating the maximum distance between the profiles of any two
target objects in a cluster, is termed a "cluster diameter."
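
The cluster quantities in definitions (i.) through (k.) can be illustrated with a small sketch. It assumes profiles are equal-length lists of numbers, uses Euclidean distance, and reads "statistical variance" as the mean squared distance to the cluster profile. None of these choices is fixed by the text, and the function names are hypothetical.

```python
import math
from itertools import combinations


def distance(p, q):
    """Euclidean distance between two numeric profiles (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cluster_profile(profiles):
    """Average each attribute over all profiles in the cluster."""
    return [sum(column) / len(profiles) for column in zip(*profiles)]


def cluster_variance(profiles):
    """Mean squared distance from each profile to the cluster profile."""
    center = cluster_profile(profiles)
    return sum(distance(p, center) ** 2 for p in profiles) / len(profiles)


def cluster_diameter(profiles):
    """Largest distance between any two profiles in the cluster."""
    return max((distance(p, q) for p, q in combinations(profiles, 2)), default=0.0)


cluster = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
print(cluster_profile(cluster))
print(cluster_variance(cluster))
print(cluster_diameter(cluster))
```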

Brief Summary Text (16):

The system for electronic identification of desirable objects of the present 
invention automatically constructs both a target profile for each target object in 
the electronic media based, for example, on the frequency with which each word 
appears in an article relative to its overall frequency of use in all articles, as 
well as a "target profile interest summary" for each user, which target profile 
interest summary describes the user's interest level in various types of target 
objects. The system then evaluates the target profiles against the users' target 
profile interest summaries to generate a user-customized rank ordered listing of
target objects most likely to be of interest to each user so that the user can select
from among these potentially relevant target objects, which were automatically 
selected by this system from the plethora of target objects available on the 
electronic media. 
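
The word-frequency basis for a target profile can be sketched as follows. The sketch assumes whitespace tokenization and scores each word as its rate in the article divided by its rate across all articles; the text gives this ratio only as an example, so the exact formula and the function names are assumptions. The article being profiled must be part of the collection.

```python
from collections import Counter


def word_rates(text):
    """Fraction of the text's words accounted for by each distinct word."""
    words = text.lower().split()
    counts = Counter(words)
    return {word: count / len(words) for word, count in counts.items()}


def target_profile(article, all_articles):
    """Score each word by its rate in the article relative to its overall rate."""
    overall = word_rates(" ".join(all_articles))
    local = word_rates(article)
    return {word: rate / overall[word] for word, rate in local.items()}


articles = ["the cat sat", "the dog ran", "the cat ran"]
profile = target_profile(articles[0], articles)
for word, score in sorted(profile.items(), key=lambda item: -item[1]):
    print(word, round(score, 2))
```

Words that are common everywhere, such as "the", score near 1, while words concentrated in the article score higher.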

Brief Summary Text (19):

The preferred embodiment of the system for customized electronic identification of 
desirable objects operates in an electronic media environment for accessing these 
target objects, which may be news, electronic mail, other published documents, or 
product descriptions. The system in its broadest construction comprises three 
conceptual modules, which may be separate entities distributed across many 
implementing systems, or combined into a lesser subset of physical entities. The 
specific embodiment of this system disclosed herein illustrates the use of a first 
module which automatically constructs a "target profile" for each target object in 
the electronic media based on various descriptive attributes of the target object. A 
second module uses interest feedback from users to construct a "target profile 
interest summary" for each user, for example in the form of a "search profile set" 
consisting of a plurality of search profiles, each of which corresponds to a single 
topic of high interest for the user. The system further includes a profile 
processing module which estimates each user's interest in various target objects by 
reference to the users' target profile interest summaries, for example by comparing 
the target profiles of these target objects against the search profiles in users'
search profile sets, and generates for each user a customized rank-ordered listing 
of target objects most likely to be of interest to that user. Each user's target 
profile interest summary is automatically updated on a continuing basis to reflect 
the user's changing interests. 
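
The profile processing module's comparison step can be sketched as follows. The sketch assumes numeric profiles of equal length, Euclidean distance, and an interest estimate equal to the distance from a target profile to the nearest search profile in the user's search profile set. These are simplifying assumptions, and the names and sample data are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two numeric profiles (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_targets(target_profiles, search_profile_set, top_n=None):
    """Order target objects by distance to the nearest search profile, closest first."""
    scored = []
    for name, profile in target_profiles.items():
        nearest = min(distance(profile, sp) for sp in search_profile_set)
        scored.append((nearest, name))
    scored.sort()
    return [name for _, name in scored[:top_n]]


targets = {
    "sports-article": [1.0, 0.0],
    "finance-article": [0.0, 1.0],
    "weather-article": [0.5, 0.5],
}
search_profiles = [[0.9, 0.1]]  # one topic of high interest for this user
ranking = rank_targets(targets, search_profiles)
print(ranking)
```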

Detailed Description Text (69):

Instead of breaking the text into its component words, one could alternatively break 
the text into overlapping word bigrams (sequences of 2 adjacent words), or more 
generally, word n-grams. These word n-grams may be scored in the same way as 
individual words. Another possibility is to use character n-grams. For example, this 
sentence contains a sequence of overlapping character 5-grams which starts "for e",
"or ex", "r exa", " exam", "examp", etc. The sentence may be characterized,
imprecisely but usefully, by the score of each possible character 5-gram ("aaaaa",
"aaaab", . . . "zzzzz") in the sentence. Conceptually speaking, in the character
5-gram case, the textual attribute would be decomposed into at least 26.sup.5
= 11,881,376 numeric attributes. Of course, for a given target object, most of these
numeric attributes have values of 0, since most 5-grams do not appear in the target 
object attributes. These zero values need not be stored anywhere. For purposes of 
digital storage, the value of a textual attribute could be characterized by storing 
the set of character 5-grams that actually do appear in the text, together with the 
nonzero score of each one. Any 5-gram that is not included in the set can be assumed
to have a score of zero. The decomposition of textual attributes is not limited to 
attributes whose values are expected to be long texts. A simple, one-term textual 
attribute can be replaced by a collection of numeric attributes in exactly the same 
way. Consider again the case where the target objects are movies. The "name of 
director" attribute, which is textual, can be replaced by numeric attributes giving
the scores for "Federico-Fellini," "Woody-Allen," "Terence-Davies," and so forth, in
that attribute. For these one-term textual attributes, the score of a word is
usually defined to be its rate in the text, without any consideration of global
frequency. Note that under these conditions, one of the scores is 1, while the other
scores are 0 and need not be stored. For example, if Davies did direct the film,
then it is "Terence-Davies" whose score is 1, since "Terence-Davies" constitutes
100% of the words in the textual value of the "name of director" attribute. It might
seem that nothing has been gained over simply regarding the textual attribute as
having the string value "Terence-Davies." However, the trick of decomposing every
non-numeric attribute into a collection of numeric attributes proves useful for the
clustering and decision tree methods described later, which require the attribute
values of different objects to be averaged and/or ordinally ranked. Only numeric
attributes can be averaged or ranked in this way.
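
The sparse storage of character n-gram scores described in this passage can be sketched as follows. The sketch assumes the score of an n-gram is simply its rate among all n-grams in the text, with no global-frequency term; the function name is hypothetical.

```python
from collections import Counter


def char_ngram_scores(text, n=5):
    """Map each character n-gram that occurs in the text to its rate in the text."""
    text = text.lower()
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


scores = char_ngram_scores("for example")
print(list(scores)[:5])

# Only n-grams that actually occur are stored; any other n-gram scores zero.
print(scores.get("zzzzz", 0.0))

# Size of the conceptual attribute space for lowercase-letter 5-grams.
print(26 ** 5)
```

Storing a dictionary of the nonzero entries stands in for the full vector of 26.sup.5 numeric attributes, almost all of which would be zero.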

Detailed Description Text (117):

In step 5 of this pseudo-code, smaller thresholds are typically used at lower levels 
of the tree, for example by making the threshold an affine function or other 
function of the cluster variance or cluster diameter of the cluster p.sub.i. If the 
cluster tree is distributed across a plurality of servers, as described in the 
section of this description titled "Network Context of the Browsing System", this
process may be executed in distributed fashion as follows: steps 3-7 are executed by
the server that stores the root node of hierarchical cluster tree T, and the
recursion in step 7 to a subcluster tree T.sub.i involves the transmission of a
search request to the server that stores the root node of tree T.sub.i, which server
carries out the recursive step upon receipt of this request. Steps 1-2 are carried
out by the processor that initiates the search, and the server that executes step 6 
must send a message identifying the target object to this initiating processor, 
which adds it to the list. 

Detailed Description Text (242) : 

Algorithms for constructing multicast trees have either been ad-hoc, as is the case 
of the Deering, et al. Internet multicast tree, which adds clients as they request 
service by grafting them into the existing tree, or by construction of a minimum
cost spanning tree. A distributed algorithm for creating a spanning tree (defined as
a tree that connects, or "spans," all nodes of the graph) on a set of Ethernet
bridges was developed by Radia Perlman ("Interconnections: Bridges and Routers,"
Radia Perlman, Addison-Wesley, 1992). Creating a minimal-cost spanning tree for a
graph depends on having a cost model for the arcs of the graph (corresponding to
communications links in the communications network). In the case of Ethernet
bridges, the default cost (more complicated costing models for path costs are
discussed on pp. 72-73 of Perlman) is calculated as a simple distance measure to the
root; thus the spanning tree minimizes the cost to the root by first electing a
unique root and then constructing a spanning tree based on the distances from the
root. In this algorithm, the root is elected by recourse to a numeric ID contained
in "configuration messages": the server whose ID has minimum numeric value is
chosen as the root. Several problems exist with this algorithm in general. First, 
the method of using an ID does not necessarily select the best root for the nodes 
interconnected in the tree. Second, the cost model is simplistic. 
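
The root election and distance-based tree construction described above can be sketched as follows. This is a centralized simplification: it assumes the whole topology is known as an adjacency map and that distance means hop count. The bridge algorithm itself is distributed and exchanges configuration messages, which this sketch does not model; the function names and sample IDs are hypothetical.

```python
from collections import deque


def elect_root(node_ids):
    """The node whose numeric ID is smallest becomes the root."""
    return min(node_ids)


def spanning_tree(adjacency):
    """Return {node: parent} for a breadth-first tree rooted at the elected root."""
    root = elect_root(adjacency)
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parent:  # first visit is along a shortest hop path
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


# Hypothetical bridge IDs and their links.
bridges = {7: [3, 9], 3: [7, 9, 12], 9: [7, 3, 12], 12: [3, 9]}
tree = spanning_tree(bridges)
print(tree)
```

The sketch also shows the first criticism in the passage: bridge 3 becomes the root only because its ID is smallest, not because it is well placed in the network.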

Detailed Description Text (250):

In another variation, where target profile interest summaries are embodied as search 
profile sets, the following procedure is followed to compute w(Si, C): (a). For each
search profile P.sub.s in the locally stored search profile set of any user in the
user base of proxy server Si, proxy server Si computes the distance d(P.sub.s,
P.sub.c) between the search profile and the cluster profile P.sub.c of cluster C.
(b). w(Si, C) is chosen to be the maximum value of (-d(P.sub.s, P.sub.c)/r) across all
such search profiles P.sub.s, where r is computed as an affine function of the
cluster diameter of cluster C. The slope and/or intercept of this affine function
are chosen to be smaller (thereby increasing w(Si, C)) for servers Si for which the
target object provider wishes to improve performance, as may be the case if the 
users in the user base of proxy server Si pay a premium for improved performance, or 
if performance at Si will otherwise be unacceptably low due to slow network 
connections. 

Detailed Description Text (258) : 

5. The multicast tree MT(C) is computed by standard methods to be the minimum
spanning tree (or a near-minimum spanning tree) for G(C), where the weight of an
edge between two core servers is taken to be the cost of transmitting a message
between those two core servers. Note that MT(C) does not contain as vertices all
but only the core servers for cluster C.
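
One of the "standard methods" for a minimum spanning tree is Kruskal's algorithm, sketched below. The text does not say which method is used, so this choice is an assumption, as are the server names and link costs.

```python
def minimum_spanning_tree(nodes, edges):
    """Kruskal's algorithm. `edges` is a list of (cost, u, v) tuples."""
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the set representative, halving the path as we go.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for cost, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so it creates no cycle
            parent[root_u] = root_v
            tree.append((u, v, cost))
    return tree


# Hypothetical core servers for one cluster and message costs between them.
core_servers = ["A", "B", "C", "D"]
links = [(4, "A", "B"), (1, "A", "C"), (3, "B", "C"), (2, "C", "D"), (5, "B", "D")]
mt = minimum_spanning_tree(core_servers, links)
print(mt)
```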



Detailed Description Text (272):

When multiple versions of a file F exist on local servers, throughout the data 
communication network N, but are not marked as alternate versions of the same file, 
the system's ability to rapidly locate files similar to F (by treating them as 
target objects and applying the methods disclosed in "Searching for Target Objects" 
above) makes it possible to find all the alternate versions, even if they are stored 
remotely. These related data files may then be reconciled by any method. In a simple 
instantiation, all versions of the data file would be replaced with the version that 
had the latest date or version number. In another instantiation, each version would 
be automatically annotated with references or pointers to the other versions. 
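
The two reconciliation strategies mentioned above can be sketched as follows. The sketch assumes the alternate versions have already been located by the similarity search and that each carries a modification date; the `FileVersion` record, the function names and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileVersion:
    server: str
    modified: date
    content: str


def reconcile_latest(versions):
    """Replace every copy with the content of the most recently modified version."""
    latest = max(versions, key=lambda v: v.modified)
    return [FileVersion(v.server, latest.modified, latest.content) for v in versions]


def cross_reference(versions):
    """Map each server to the other servers that hold an alternate version."""
    return {
        v.server: [other.server for other in versions if other.server != v.server]
        for v in versions
    }


copies = [
    FileVersion("server-a", date(1995, 3, 1), "draft 1"),
    FileVersion("server-b", date(1995, 6, 15), "draft 3"),
    FileVersion("server-c", date(1995, 4, 20), "draft 2"),
]
reconciled = reconcile_latest(copies)
pointers = cross_reference(copies)
print([v.content for v in reconciled])
print(pointers["server-a"])
```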

Detailed Description Text (298):

The filtering technology of the news clipping service is not limited to news 
articles provided by a single source, but may be extended to articles or target 
objects collected from any number of sources. For example, rather than identifying 
new news articles of interest, the technology may identify new or updated World Wide 
Web pages of interest. In a second application, termed "broadcast clipping," where 
individual users desire to broadcast messages to all interested users, the pool of 
news articles is replaced by a pool of messages to be broadcast, and these messages 
are sent to the broadcast-clipping-service subscribers most interested in them. In a 
third application, the system scans the transcripts of all real-time spoken or 
written discussions on the network that are currently in progress and designated as 
public, and employs the news-clipping technology to rapidly identify discussions 
that the user may be interested in joining, or to rapidly identify and notify users 
who may be interested in joining an ongoing discussion. In a fourth application, the 
method is used as a post-process that filters and ranks in order of interest the 
many target objects found by a conventional database search, such as a search for 
all homes selling for under $200,000 in a given area, for all 1994 news articles 
about Marcia Clark, or for all Italian-language films. In a fifth application, the 
method is used to filter and rank the links in a hypertext document by estimating 
the user's interest in the document or other object associated with each link. In a 
sixth application, paying advertisers, who may be companies or individuals, are the 
source of advertisements or other messages, which take the place of the news 
articles in the news clipping service. A consumer who buys a product is deemed to 
have provided positive relevance feedback on advertisements for that product, and a 
consumer who buys a product apparently because of a particular advertisement (for 
example, by using a coupon clipped from that advertisement) is deemed to have 
provided particularly high relevance feedback on that advertisement. Such feedback 
may be communicated to a proxy server by the consumer's client processor (if the 
consumer is making the purchase electronically), by the retail vendor, or by the 
credit-card reader (at the vendor's establishment) that the consumer uses to pay for 
the purchase. Given a database of such relevance feedback, the disclosed technology 
is then used to match advertisements with those users who are most interested in 
them; advertisements selected for a user are presented to that user by any one of 
several means, including electronic mail, automatic display on the user's screen, or 
printing them on a printer at a retail establishment where the consumer is paying 
for a purchase. The threshold distance used to identify interest may be increased 
for a particular advertisement, causing the system to present that advertisement to 
more users, in accordance with the amount that the advertiser is willing to pay. 
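
The threshold adjustment described above can be sketched as follows. This is an illustration under stated assumptions only: the function `users_to_target`, the `bonus` parameter, and the numeric distances are hypothetical, and the patent excerpt does not specify how payment maps to a threshold increase.

```python
def users_to_target(distances, base_threshold, bonus=0.0):
    """Return the users whose distance to the ad is within the threshold.

    distances: dict mapping user ID -> distance between the advertisement's
               profile and the user's interest summary (smaller means more
               interested).
    bonus:     hypothetical threshold increase tied to what the advertiser
               is willing to pay.
    """
    threshold = base_threshold + bonus
    return sorted(user for user, d in distances.items() if d <= threshold)


distances = {"u1": 0.2, "u2": 0.5, "u3": 0.9}
narrow = users_to_target(distances, base_threshold=0.4)
wide = users_to_target(distances, base_threshold=0.4, bonus=0.2)
```

Raising the threshold never removes a user from the result, so a larger `bonus` can only widen the audience.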
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ART-UNIT: 262 

PRIMARY-EXAMINER: Peng; John K. 

ASSISTANT-EXAMINER: Miller; John W. 

ATTY-AGENT-FIRM: Duft, Graziano & Forest, P.C. 



ABSTRACT: 

This invention relates to customized electronic identification of desirable objects, 
such as news articles, in an electronic media environment, and in particular to a 
system that automatically constructs both a "target profile" for each target object 
in the electronic media based, for example, on the frequency with which each word 
appears in an article relative to its overall frequency of use in all articles, as 
well as a "target profile interest summary" for each user, which target profile 
interest summary describes the user's interest level in various types of target 
objects. The system then evaluates the target profiles against the users' target 
profile interest summaries to generate a user-customized rank ordered listing of 
target objects most likely to be of interest to each user so that the user can 
select from among these potentially relevant target objects, which were 
automatically selected by this system from the plethora of target objects that are 
profiled on the electronic media. Users' target profile interest summaries can be 
used to efficiently organize the distribution of information in a large scale system 
consisting of many users interconnected by means of a communication network. 
Additionally, a cryptographically-based pseudonym proxy server is provided to ensure 
the privacy of a user's target profile interest summary, by giving the user control 
over the ability of third parties to access this summary and to identify or contact 
the user. 

22 Claims, 17 Drawing figures 
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A system and method for customized advertisement selection and delivery on the World 
Wide Web (WWW) upon the Internet. The advertising system has a database server which 
stores advertisements and their campaign information, and an advertisement server 
which generates electronic advertisements available to a client system. In the 
system, a customization process which customizes the electronic advertisements to be 
delivered to each client system is performed. A user connects to a web site and is 
presented with an editorial page or a list of search results. The system inserts a 
customized advertisement into the page that matches the page content or search 
topic. No identifiable data is collected during the interaction with the user. 
Advertisers can specify display constraints for each advertisement. The system will 
adapt all unrestricted parameters in order to maximize the user's click-through 
probability. 

20 Claims, 13 Drawing figures 
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Detailed Description Text (6) : 

FIG. 3 shows a flow chart of the script that handles requests for an advertisement 
image. Upon invocation by the web server in step 1001, the script will first decode 
the parameters that have been passed to the script in step 1002. The Common Gateway 
Interface (CGI) (as defined by the NCSA) is a standard protocol that allows client 
and server applications to exchange data over HTTP. CGI is implemented in almost all 
common web server implementations today, but persons skillful in the art will 
realize that it is easy to provide a custom implementation with similar support. The 
selection process can be shortcut by explicitly requesting a particular 
advertisement by its advertisement ID in step 1003. Otherwise the system tries to 
detect customization parameters in the request in step 1005. In the example shown in 
FIG. 11, the user's search word is a customization parameter, but it could also be a 
page ID or the name of the user's browser software. In case such information has 
been embedded into the request, the system will call the selection module 1006 to 
select a customized advertisement for the particular situation. If neither 
advertisement ID nor customization parameters are present, the system will simply 
obtain a list of currently active advertisements (i.e. advertisements that feature 
display constraints which do not prevent them from being shown under the current 
conditions) in step 1007 and select the advertisement with the highest required 
impression rate in step 1008. The impression rate of an advertisement is simply the 
number of times it should be shown on a certain web page or within a certain web 
site, and the amount of time left in the period it should be displayed in. This 
information is usually given by the advertiser and needs to be within the limits of 
total page accesses to the publisher's web site the advertisement will be shown on. 
Once an advertisement id has been determined in step 1003 and the procedure of 
selection module 1006 has been performed, the system can then call the advertisement 
data module 115 (FIG. 2) for obtaining the actual image data in step 1009. After 
returning this information to the client at step 1010 (this of course involves 
adhering to proper CGI output specification) the system will log the impression of 
the particular advertisement in step 1011 and the customization parameter used (if 
any). In case no advertisement ID had been explicitly specified (step 1012), 
additional bookkeeping is necessary to synchronize click-throughs at a later time. 
Should some form of session ID be embedded in the request in step 1013, the system 
assumes that the corresponding hyperlink has the same session ID embedded (this 
would be done on the publisher's side) and simply logs the advertisement ID under 
the particular session ID in step 1014. Otherwise the system has to use other means 
of identifying a user, such as the IP (Internet Protocol) address of the connection, 
and log the display accordingly in step 1015. Persons skillful in the art will 
realize that other forms of identification can be used instead, such as an 
advertisement server assigned user ID transmitted via cookies. After storing log 
information, execution ends at step 1016. 
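
The selection branch of the FIG. 3 flow (steps 1003 through 1008) can be sketched roughly as below. The `Ad` record, the parameter keys (`ad_id`, `search_word`, `page_id`, `browser`), and the `select_customized` callable standing in for selection module 1006 are all assumptions made for illustration. Returning `None` when no advertisement is active is likewise an assumption, since the text does not cover that case.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    ad_id: str
    active: bool                     # display constraints currently satisfied
    required_impression_rate: float  # impressions still owed per unit of time


def select_ad(params, ads, select_customized):
    """Return the ID of the advertisement to serve for one request.

    params:            decoded request parameters (dict), cf. step 1002.
    ads:               list of Ad records.
    select_customized: callable standing in for selection module 1006.
    """
    # Step 1003: an explicit advertisement ID short-cuts the selection.
    if "ad_id" in params:
        return params["ad_id"]

    # Step 1005: look for customization parameters in the request.
    keys = ("search_word", "page_id", "browser")
    custom = {k: v for k, v in params.items() if k in keys}
    if custom:
        return select_customized(custom, ads)

    # Steps 1007-1008: fall back to the active advertisement with the
    # highest required impression rate.
    active = [ad for ad in ads if ad.active]
    if not active:
        return None
    return max(active, key=lambda ad: ad.required_impression_rate).ad_id


ads = [Ad("a1", True, 0.5), Ad("a2", True, 2.0), Ad("a3", False, 9.0)]
pick_first = lambda custom, ads: ads[0].ad_id   # trivial stand-in selector
```

Logging the impression and keying it by session ID or IP address (steps 1011 through 1015) would follow this call and is omitted here.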






L11: Entry 1 of 1 



File: USPT 



Oct 28, 2003 



DOCUMENT-IDENTIFIER: US 6640218 B1 

TITLE: Estimating the usefulness of an item in a collection of information 
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One context in which selection of items from a collection of information (e.g., a 
database) is useful is a "search engine." A typical search engine takes an 
alphanumeric query from a user (a "search string") and returns to the user a list of 
one or more items from the database that satisfy some or all of the criteria 
specified in the query. 

Detailed Description Text (11) : 

Referring now to FIG. 1, a computer system 100 for rating search results may include 
a user workstation 110, a search engine 120, a database 130, a query log 140 and a 
click log 150. In some embodiments, user workstation 110 is a general purpose 
computer workstation including a keyboard 112, a video display 114, a pointing 
device (e.g. a mouse) 116, and a web browser software program 118. Search engine 120 
may be a computer programmed with software that is capable of communicating, 
directly or indirectly, with user workstation 110 and of accessing database 130, 
query log 140, and click log 150. In particular, search engine 120 is capable of 
receiving a search query entered by a user through web browser 118 and of displaying 
to the user, through web browser 118, lists of items in database 130 that satisfy 
criteria in the search query. The lists are displayed in web browser 118 in 
hypertext format so that a user may use pointing device 116 to request that selected 
items from a list be displayed in web browser 118. 

Detailed Description Text (16) : 

Next, the search engine 120 searches the database 130 for items 132 that match 
criteria specified in the normalized query and creates a list of matching items 
(step 230). The search engine then applies a relevance metric to each of the 
matching items to produce a relevance score (with respect to the particular query) 
(step 240). The relevance score may be determined by applying any known or 
subsequently developed metric that compares one or more intrinsic characteristics of 
an item with one or more criteria in a search query, for example those described in 
Manning and Schütze, "Foundations of Statistical Natural Language Processing", MIT 
Press, Cambridge, Mass. (1999) pp. 529-574 and U.S. Pat. No. 6,012,053. After the 
relevance metric has been applied to each of the matching items, the list of 
matching items is reordered so that the items with higher relevance scores are 
placed in lower numbered rank positions (i.e., closer to the beginning of the list) 
(step 250). 
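
Steps 240 and 250 can be sketched as a score-then-sort pass. The term-overlap metric below is a toy stand-in chosen only to make the example runnable; the text allows any relevance metric, and the names `rank_by_relevance` and `term_overlap` are hypothetical.

```python
def rank_by_relevance(items, query, relevance):
    """Return (rank, score, item) triples, rank 1 being the most relevant."""
    scored = [(relevance(item, query), item) for item in items]
    # list.sort is stable, so items with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(rank, score, item)
            for rank, (score, item) in enumerate(scored, start=1)]


def term_overlap(item, query):
    """Toy metric: number of query terms that appear in the item text."""
    words = set(item.lower().split())
    return sum(1 for term in query.lower().split() if term in words)


ranked = rank_by_relevance(
    ["red car sale", "blue bicycle", "used red car"], "red car", term_overlap)
```

Higher scores land in lower-numbered rank positions, matching the ordering convention used in the text.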

Detailed Description Text (17) : 

After the list of matching items has been reordered according to relevance scores, 
search engine 120 displays the list to the user through the web browser 118 (step 
260). In some embodiments, search engine 120 will initially display a web page that 
includes only the lowest ranked items in the list (i.e. those having the highest 
relevance factors), displayed in rank order, and allow the user to request the 
display of additional web pages that display successively higher ranked (i.e., less 
relevant) items. Each item is displayed with a title, a squib, and a hyperlink that 
enables the user to click on the item to display the underlying information resource 
it describes. The hyperlinks in the displayed pages are configured so when the user 
clicks on a particular hyperlink to select one of the displayed items, the user's 
web browser transmits an HTTP request to the search engine to display the underlying 
information resource described by the item. For example, if a displayed item 
describes a particular web page, clicking on the associated hyperlink will cause a 
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request to display that web page to be sent to search engine 120. Requests are sent 
to the search engine, rather than to the web server on which the underlying resource 
is located, to permit the search engine to keep track of what requests are made in 
response to the results of particular queries. Once received at the search engine, 
these requests are processed (as described below) and forwarded to the appropriate 
web server. (In non-Web based embodiments, the underlying information resource may 
be a record from database 130, which can be retrieved by search engine 120 
directly.) 

Detailed Description Text (18) : 

The list of items displayed in order of relevance score will be referred to as a 
"relevance list." In addition to creating and displaying the "relevance list" for a 
particular query, search engine 120 also creates and displays a separate "popularity 
list" for the received query (step 265). The popularity list includes popular items 
that have been previously selected by users in response to the same normalized query 
in the past. As with the items on the "relevance" list, the items on the 
"popularity" list each include a title and squib and a hyperlink enabling a user to 
access the underlying information resource. In some embodiments, the popularity list 
is displayed in the web browser simultaneously with the relevance list (i.e. in 
different parts of the same Web page). The steps taken to create the popularity list 
will be described below. 

Detailed Description Text (19) : 

Referring now to FIG. 3, the following steps are taken when a user selects one of 
the items displayed in either the relevance list or the popularity list. The search 
engine receives the selection request (step 270) and creates a click record 180 in 
the click log 150, which includes the URL 182 of the item, along with the normalized 
query 184 and the rank 186 of the item in the relevance list displayed to the user 
(the "relevance rank" of the item with respect to the query) (step 280). (In some 
embodiments, even if the item is selected from the popularity list, the rank 
recorded in the click log is the relevance rank.) The search engine then redirects 
the user's request to the URL of the underlying information resource, using standard 
HTTP techniques known to those of skill in the art, which causes the underlying 
information resource to be displayed on the user's web browser (step 290). Because a 
user may wish to select more than one of the items displayed by the web browser, 
steps 270 through 290 may be repeated as many times as the user clicks on items in 
the list of matching items. 
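
Steps 270 through 290 amount to logging a click record and then redirecting. The sketch below assumes an in-memory list as the click log and a plain `(status, headers)` tuple as the response; `ClickRecord` and `handle_selection` are hypothetical names, and a 302 redirect is only one of the standard HTTP techniques the text alludes to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClickRecord:
    url: str               # URL 182 of the selected item
    normalized_query: str  # normalized query 184
    relevance_rank: int    # rank 186 of the item in the relevance list


def handle_selection(click_log, url, normalized_query, relevance_rank):
    """Log the click (step 280), then describe the redirect (step 290)."""
    click_log.append(ClickRecord(url, normalized_query, relevance_rank))
    # 302 Found with a Location header sends the browser to the resource.
    return 302, {"Location": url}


click_log = []
status, headers = handle_selection(
    click_log, "http://example.com/a", "red car", 3)
```

Calling `handle_selection` once per click mirrors the repetition of steps 270 through 290 for each item the user selects.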

Detailed Description Text (29) : 

Referring now to FIG. 6, the following steps may be used to determine the Actual 
Pooled Popularity value for an item. First, the number of click records 180 in click 
log 150 that include the item are counted, and the Actual Pooled Popularity is set 
equal to that number (step 500). This number indicates the number of times that the 
item was selected by a user in response to any query. (In some embodiments, certain 
clicks originating from outside of the normal search engine interface are not 
counted. For example, certain click records may reflect "clicks" that are made 
through a metacrawler program. Such programs can query a number of search engines 
and then display a combined output of those search engines in a single list. When a 
user selects a displayed item by clicking on it, the request may be forwarded back 
to the search engine, thus counting as a "click." It may be useful to disregard such 
"clicks" because they do not represent clicks from a relevance ranked list produced 
by the search engine.) 
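
Step 500 is a count of click records per item, pooled across all queries. In the sketch below, the `source` field used to filter out metacrawler clicks is a hypothetical tag; the text does not say how such clicks are recognized.

```python
from collections import Counter


def actual_pooled_popularity(click_log, excluded_sources=frozenset()):
    """Count clicks per item URL across all queries (step 500).

    click_log: iterable of (url, normalized_query, source) tuples, where
               `source` is a hypothetical tag marking the click's origin.
    """
    return Counter(
        url for url, _query, source in click_log
        if source not in excluded_sources
    )


log = [
    ("http://example.com/a", "red car", "web"),
    ("http://example.com/a", "used car", "web"),
    ("http://example.com/b", "red car", "web"),
    ("http://example.com/a", "red car", "metacrawler"),
]
popularity = actual_pooled_popularity(log, excluded_sources={"metacrawler"})
unfiltered = actual_pooled_popularity(log)
```

Because the count ignores the query field, an item selected under several different queries accumulates a single pooled total.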

Detailed Description Text (42) : 

Also, some or all of the Actual Pooled Popularity values, Predicted Pooled 
Popularity values, Predicted Selection Rates and Quality Adjusted Selection Rates 
for particular combinations of queries and items can be calculated in advance of 
their being needed. For example, if the contents of the query log and click log were 
limited to a "data snap shot" as of a certain time, the Actual Pooled Popularity 
values, Predicted Pooled Popularity values, Predicted Selection Rates and Quality 
Adjusted Selection Rates for all combinations of queries and items reflected in the 
click log could be calculated at that time, and stored in a separate database for 
use in generating popularity lists in real time. Alternatively, the "popularity 
list" for each query reflected in the click log could also be determined at the time 
the Quality Adjusted Selection Rates are determined. 

Detailed Description Text (43) : 

The search engine may display in the relevance list only those items that do not 
appear in the popularity list. Also, in some embodiments the search engine may not 
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display a separate relevance list and popularity list, but may instead display a 
single list ordered according to relative Quality Adjusted Selection Rates. In such 
an embodiment, items in the relevance list that did not have an Actual Pooled 
Popularity value (because they had never before been selected by a user) could be 
assigned a Quality Adjusted Selection Rate equal to the value of the Selection Rate 
Predictor function (i.e. the expected selection rate in the absence of any 
information about quality). 

Detailed Description Text (47) : 

The search engine is not limited to searching based on queries entered by users. For 
example, the search engine could search for items based on a user profile (e.g. a 
list of topics of interest to the user, demographic information about the user, or 
prior selection patterns of the user), or other contextual information. The 
Selection Rate Predictor function would then be a function of a measure of the 
relevance of an item with respect to the user profile or other contextual 
information. 
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